- Path: news.ucdavis.edu!quad!knight
- From: knight@quad.cs.ucdavis.edu (James Knight)
- Newsgroups: comp.lang.c,comp.unix.programmer
- Subject: Re: Q: '\n' character
- Followup-To: comp.lang.c,comp.unix.programmer
- Date: 16 Apr 1996 21:37:34 GMT
- Organization: University of California, Davis
- Message-ID: <4l13uu$mva@mark.ucdavis.edu>
- References: <4kj66f$k0o@ren.cei.net> <AD97189A966891F2@mcdiala02.it.luc.edu> <4ktn04INNoev@keats.ugrad.cs.ubc.ca> <4ku8f9$d3o@mark.ucdavis.edu> <4kumbqINNgcr@mayne.ugrad.cs.ubc.ca>
- NNTP-Posting-Host: quad.cs.ucdavis.edu
- X-Newsreader: TIN [version 1.2 PL2]
-
- Kazimir Kylheku (c2a192@ugrad.cs.ubc.ca) wrote:
- : In article <4ku8f9$d3o@mark.ucdavis.edu>,
- : James Knight <knight@quad.cs.ucdavis.edu> wrote:
-
- : This is a good effort: I will try to look for any marginal improvements.
-
- : > if ((buffer = realloc(buffer, bufsize)) == NULL)
- : > return NULL;
-
- : Just one quip: when realloc() fails, the original data is not lost. So you have
- : to keep the original pointer around, and be ready to either leave the data as
- : it is, or free() it.
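
Under ANSI C, a failed realloc() returns NULL and leaves the old block allocated and unchanged, so assigning the result straight back to buffer loses the only pointer to it. A minimal sketch of the keep-the-pointer pattern follows. The name grow_buffer() is hypothetical and not part of the posted function, and pre-ANSI libraries may not give the same guarantee.

```c
#include <stdlib.h>

/*
 * Grow *bufp to newsize bytes.
 * Returns 0 on success, -1 on failure.
 * On failure *bufp is left untouched, so the caller can either
 * keep using the old data or free() it.
 * (grow_buffer is a hypothetical helper name.)
 */
static int grow_buffer(char **bufp, size_t newsize)
{
    char *tmp = realloc(*bufp, newsize);

    if (tmp == NULL)
        return -1;          /* old block is still valid under ANSI C */
    *bufp = tmp;
    return 0;
}
```

A caller would write something like: if (grow_buffer(&buffer, bufsize) < 0) { free(buffer); return NULL; }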
-
- It depends on which realloc is used. The man page for the default realloc on my
- system (Ultrix 4.3A) says that the block may be destroyed. And I'll admit that
- the one major deficiency of this function is that it does not distinguish
- between EOF, read errors and malloc/realloc errors. I left that out because
- the error handling really depends on how the rest of the program wants to
- handle errors. Should it set a local "errno"-style value to signal an error,
- should it just print an error message and exit, or should it do something else?
- Once the program's error handling is decided, the function's error handling
- can be set to match.
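
One possible way to separate EOF from a read error, sketched under the assumption that the caller wants a status code rather than a bare NULL. The enum and the name read_status() are hypothetical and not part of the posted function. An allocation failure would need a further code set by the realloc path, which is not shown here.

```c
#include <stdio.h>

/* Hypothetical status codes; the posted function only returns NULL. */
enum line_status { LINE_OK, LINE_EOF, LINE_READ_ERROR };

/*
 * Read one chunk with fgets() and report why a NULL came back:
 * ferror() is set after a read error, otherwise the stream hit EOF.
 */
static enum line_status read_status(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) != NULL)
        return LINE_OK;
    return ferror(fp) ? LINE_READ_ERROR : LINE_EOF;
}
```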
-
- If you needed a truly robust version, and the rest of the program was both
- aware of the memory error and able to free up memory, then you could add
- a static "out of memory" flag to the function. The flag is set when the memory
- error occurs; when the function is called again with the flag set, it skips the
- initial allocation and read and resumes the computation inside the while loop.
- There's no point in the function trying to free anything itself, because all
- of the memory it holds is needed to retain the portion of the current line
- read so far.
-
-
- : > /*
- : > * Strip the newline from the line, if it's there.
- : > */
- : > if (buffer[len-1] == '\n')
- : > buffer[--len] = '\0';
- : >
- : > if (len_out) *len_out = len;
- : > return buffer;
-
- : Ah, you forgot to adjust the buffer size for the actual length read! If the
- : line is 128K plus one byte, you will return 256K---the next higher power of
- : two. No biggie, but it's easy to fix with a realloc down to the actual size.
- : I'm not sure how paranoid one ought to be when checking the result of a
- : _shrinking_ realloc, but I'd treat it the same way as a growing one to be safe.
-
- Except for one point, does it really matter whether the allocated array is longer
- than the line? The rest of the program should never touch the memory after the
- array (or free/realloc the array), so it shouldn't matter to the program how long
- the array is. The only exception is that if a really long line is read, and then
- shorter lines are read, then the function should probably shorten its length to
- free up as much space as possible. But, on machines today, this only becomes
- a problem when you get near the MB range, and for the purpose I was using it for
- (namely, reading typed in lines and files where a limited line length was assumed)
- it was never that big a problem.
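
For reference, the shrinking step suggested in the quoted text can be sketched as below. It assumes ANSI realloc() semantics, where a failed call leaves the larger block valid, so the fallback is simply to return the block unchanged. The name shrink_to_fit() is hypothetical.

```c
#include <stdlib.h>

/*
 * Trim buf, which holds a string of length len, to len + 1 bytes.
 * If the shrinking realloc() fails, the larger block is still
 * valid, so it is returned as-is.
 * (shrink_to_fit is a hypothetical helper name.)
 */
static char *shrink_to_fit(char *buf, size_t len)
{
    char *tmp = realloc(buf, len + 1);

    return tmp != NULL ? tmp : buf;
}
```

In the posted function this would replace the final "return buffer;" with "return shrink_to_fit(buffer, len);".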
-
-
- : Nevertheless, my original comments about fgets() versus loops that use getc()
- : apply: the above might be quite a bit cleaner if you were content to spoon out
- : a character at a time. No strlen(), no odd buffer manipulation---just a pointer
- : that you advance, and check against the buffer bounds.
-
- Yes, but which one is faster? I did this experiment for a class I was teaching,
- where I compared all combinations of fgetc, fgets, fputc and fputs for a cat
- program that does line by line reading. The fgetc/fputc combination ran in
- about 10 seconds (I think I was reading 1MB of text), the fgets/fputc and
- fgetc/fputs combinations ran in about 6.5-7 seconds and the fgets/fputs
- combination ran in 3.5 seconds. I can post more details about this, if you
- like.
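
The two extremes of that comparison might look like the sketch below. This is not the original test program, and the function names are hypothetical. Timing each version over the same large file (for example with time(1)) would reproduce the kind of measurement described; actual numbers depend on the machine and the stdio implementation.

```c
#include <stdio.h>

/* Copy in to out one character at a time (the fgetc/fputc case). */
static void copy_by_char(FILE *in, FILE *out)
{
    int c;

    while ((c = fgetc(in)) != EOF)
        fputc(c, out);
}

/* Copy in to out one fgets() chunk at a time (the fgets/fputs case). */
static void copy_by_line(FILE *in, FILE *out)
{
    char line[BUFSIZ];

    while (fgets(line, sizeof line, in) != NULL)
        fputs(line, out);
}
```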
-
-
- : What about dealing with null characters in the input lines?
-
- I didn't worry about those. I've never seen a text file that had null
- characters, and most programs that write text (text editors and the basic
- Unix utilities) won't insert any null characters into files.
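
If null characters did matter, the getc() approach from the quoted text handles them naturally, because the length is counted instead of being recovered with strlen(). A minimal sketch, assuming a doubling buffer and a caller that frees the result; read_line_getc() is a hypothetical name and is not the posted function.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one line with getc(), growing the buffer by doubling.
 * The length is counted, so a '\0' in the input does not cut the
 * line short. The newline is not stored. Returns NULL if nothing
 * could be read (EOF or read error) or if allocation fails.
 * (read_line_getc is a hypothetical name.)
 */
static char *read_line_getc(FILE *fp, size_t *len_out)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap), *tmp;
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the terminator */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out)
        *len_out = len;
    return buf;
}
```

The caller has to use the returned length rather than strlen() for the embedded null characters to be visible.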
-
- Jim
-
-